(57) Abstract

Provided are two plant cDNA clones, derived from cotton
(Gossypium hirsutum), that are homologs of the bacterial CelA
genes encoding the catalytic subunit of cellulose synthase.
Also provided are genomic promoter regions for these cellulose
synthase encoding regions. Methods for using cellulose synthase
in cotton fiber and wood quality modification are also provided.
PLANT CELLULOSE SYNTHASE AND PROMOTER SEQUENCES

INTRODUCTION

Technical Field

This invention relates to plant cellulose synthase cDNA
encoding sequences, and their use in modifying plant
phenotypes. Methods are provided whereby the sequences can be
used to control or limit the expression of endogenous
cellulose synthase.

This invention also relates to methods of using in vitro
constructed DNA transcription or expression cassettes capable
of directing fiber-tissue transcription of a DNA sequence of
interest in plants to produce fiber cells having an altered
phenotype, and to methods of providing for or modifying
various characteristics of cotton fiber. The invention is
exemplified by methods of using cotton fiber promoters for
altering the phenotype of cotton fiber, and cotton fibers
produced by the method.

Background

In spite of much effort, no one has succeeded in
isolating and characterizing the enzyme(s) responsible for
synthesis of the major cell wall polymer of plants, cellulose.

Numerous efforts have been directed toward the study of
synthesis of cellulose (1,4-β-D-glucan) in higher plants.
However, hampered by low rates of activity in vitro, the
cellulose synthase of plants has resisted purification and
detailed characterization (for reviews, see 1,2). Aided by
the discovery of cyclic-di-GMP as a specific activator, the
cellulose synthase of the bacterium Acetobacter xylinum can be
easily assayed in vitro, has been purified to homogeneity, and
a catalytic subunit identified (for reviews, see 2,3).
Furthermore, an operon of four genes involved in cellulose
synthesis in A. xylinum has been cloned (4-7).

Characterization of these genes indicates that the first
gene, termed either BcsA (7) or AcsAB (6), codes for the 83 kD
subunit of the cellulose synthase that binds the substrate
UDP-glc and presumably catalyzes the polymerization of glucose
residues to 1,4-β-D-glucan (8). The second gene (B) of the
operon is believed to function as a regulatory subunit binding
cyclic-di-GMP (9), while recent evidence suggests that the C
and D genes may code for proteins that form a pore allowing
secretion of the polymer and control the pattern of
crystallization of the resulting microfibrils (6).




Agrobacterium tumefaciens, have also led to cloning of genes
involved in cellulose synthesis (10,11), although the proposed
pathway of synthesis differs in some respects from that of A.
xylinum. In A. tumefaciens, a CelA gene showing significant
homology to the BcsA/AcsAB gene of A. xylinum is proposed to
transfer glc from UDP-glc to a lipid acceptor; other gene
products may then build up a lipid oligosaccharide that is
finally polymerized to cellulose by the action of an
endo-glucanase functioning in a synthetic mode. In addition,
homologs of the CelA, B, and C genes have been identified in
E. coli, but, as this organism is not known to synthesize
cellulose in vivo, the function of these genes is not clear
(2).

These successes in bacterial systems opened the
possibility that homologs of the bacterial genes might be
identified in higher plants. However, experiments in a number
of laboratories utilizing the A. xylinum genes as probes for
screening plant cDNA libraries have failed to identify similar
plant genes. Such lack of success suggests that, if plants do
contain homologs of the bacterial genes, their overall
sequence homology is not very high. Recent studies analyzing
the conserved motifs common to glycosyl transferases using
either UDP-glc or UDP-GlcNAc as substrate suggest that there
are specific conserved regions that might be expected to be
found in any plant homolog of the catalytic subunit (referred
to hereafter as CelA). In one of these studies, Delmer and
Amor (2) identified a motif common to many such
glycosyl transferases including the bacterial CelA proteins.
An independent analysis (6) also concluded that this motif was
highly conserved in a group of similar glycosyl transferases.

Extending these studies further, Saxena et al. (12)
presented an elegant model for the mechanism of catalysis for
enzymes such as cellulose synthase that have the unique
problem of synthesizing consecutive residues that are rotated
approximately 180° with respect to each other. The
model invokes independent UDP-glc binding sites and, based
upon hydrophobic cluster analysis of these enzymes, the
authors concluded that 3 critical regions in all such
processive glycosyl transferases each contain a conserved
aspartate (D) residue, while a fourth region contained a
conserved QXXRW motif. The first D residue resides in the
motif as previously analyzed (2,6).
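
The kind of sequence feature described above can be located with a
simple pattern scan. The following Python sketch is illustrative
only: the function names are hypothetical, and the sequence is an
invented toy string, not a CelA protein. It reports the positions of
QXXRW motifs and of aspartate (D) residues in a one-letter protein
sequence.

```python
import re


def find_qxxrw(protein: str) -> list[int]:
    """Return 1-based start positions of every QXXRW motif.

    QXXRW = glutamine, any two residues, arginine, tryptophan.
    A lookahead is used so that overlapping matches are not missed.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=Q..RW)", protein.upper())]


def aspartate_positions(protein: str) -> list[int]:
    """Return 1-based positions of every aspartate (D) residue."""
    return [i + 1 for i, aa in enumerate(protein.upper()) if aa == "D"]


# Invented toy sequence for illustration only (not a real CelA protein).
toy = "MKDAAGLDVSTDPLQAARWGG"
print(find_qxxrw(toy))
print(aspartate_positions(toy))
```

A scan of this kind only flags candidate residues; deciding which
aspartates are the conserved ones still requires an alignment against
known processive glycosyl transferases.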

In general, genetic engineering techniques have been
directed to modifying the phenotype of individual prokaryotic
and eukaryotic cells, especially in culture. Plant cells have
proven more intransigent than other eukaryotic cells, due not
only to a lack of suitable vector systems but also as a result
of the different goals involved. For many applications, it is
desirable to be able to control gene expression at a
particular stage in the growth of a plant or in a particular
plant part. For this purpose, regulatory sequences are
required which afford the desired initiation of transcription
in the appropriate cell types and/or at the appropriate time
in the plant's development without having serious detrimental
effects on plant development and productivity. It is
therefore of interest to be able to isolate sequences which
can be used to provide the desired regulation of transcription
in a plant cell during the growing cycle of the host plant.

One aspect of this interest is the ability to change the
phenotype of particular cell types, such as differentiated
epidermal cells that originate in fiber tissue, i.e. cotton
fiber cells, so as to provide for altered or improved aspects
of the mature cell type. Cotton is a plant of great
commercial significance. In addition to the use of cotton
fiber in the production of textiles, other uses of cotton
include food preparation with cotton seed oil and animal feed
derived from cotton seed husks.

A related goal involving the control of cell wall
characteristics would be to affect valuable secondary tree
characteristics of wood for paper and forestry products. For
instance, by altering the balance of cellulose and lignin, the
quality of wood for paper production may be improved.

Finally, despite the importance of cotton as 



has taken place at a relatively slow rate because of the
absence of reliable promoters for use in selectively effecting
changes in the phenotype of the fiber. In order to effect the
desired phenotypic changes, transcription initiation regions
capable of initiating transcription in fiber cells during
development are desired. Thus, an important goal of cotton
bioengineering research is the acquisition of a reliable
promoter which would permit expression of a protein
selectively in cotton fiber to affect such qualities as fiber
strength, length, color and dyeability.


Relevant Literature 

Cotton fiber-specific promoters are discussed in PCT
publications WO 94/12014 and WO 95/08914, and John and Crow,
Proc. Natl. Acad. Sci. USA, 89:5769-5773, 1992. cDNA clones
that are preferentially expressed in cotton fiber have been
isolated. One of the clones isolated corresponds to mRNA and
protein that are highest during the late primary cell wall and
early secondary cell wall synthesis stages. John and Crow,
supra.

In plants, control of cytoskeletal organization is poorly
understood in spite of its importance for the regulation of
patterns of cell division, expansion, and subsequent
deposition of secondary cell wall polymers. The cotton fiber
represents an excellent system for studying cytoskeletal
organization. Cotton fibers are single cells in which cell
elongation and secondary wall deposition can be studied as
distinct events. These fibers develop synchronously within
the boll following anthesis, and each fiber cell elongates for
about 3 weeks, depositing a thin primary wall (Meinert and
Delmer, (1984) Plant Physiol. 59: 1088-1097; Basra and Malik,
(1984) Int Rev of Cytol 39: 65-113). At the time of
transition to secondary wall cellulose synthesis, the fiber
cells undergo a synchronous shift in the pattern of cortical
microtubule and cell wall microfibril alignments, events which
may be regulated upstream by the organization of actin
(Seagull, (1990) Protoplasma 159: 44-59; and (1992) In:
Proceedings of the Cotton Fiber Cellulose Conference, National
Cotton Council of America, Memphis TN, pp 171-192).

Agrobacterium-mediated cotton transformation is described
in Umbeck, United States Patents Nos. 5,004,863 and 5,159,135
and cotton transformation by particle bombardment is reported
in WO 92/15675, published September 17, 1992. Transformation
of Brassica has been described by Radke et al. (Theor. Appl.
Genet. (1988) 75:685-694; Plant Cell Reports (1992)
11:499-505).

Genes involved in lignin biosynthesis are described by
Dwivedi, U.N., Campbell, W.H., Yu, J., Datla, R.S.S., Chiang,
V.L., and Podila, G.K. (1994) "Modification of lignin
biosynthesis in transgenic Nicotiana through expression of an
antisense O-methyl transferase gene from Populus" Pl. Mol.
Biol. 26: 61-71; and Tsai, C.J., Podila, G.K. and Chiang, V.L.
(1995) "Nucleotide sequence of Populus tremuloides gene for
caffeic acid/5-hydroxyferulic acid O-methyl transferase" Pl.
Physiol. 107: 1459; and also U.S. Patent No. 5,451,514
(claiming the use of cinnamyl alcohol dehydrogenase gene in an
antisense orientation such that the endogenous plant cinnamyl
alcohol dehydrogenase gene is inhibited).

SUMMARY OF THE INVENTION

Two cotton genes, CelA1 and CelA2, have been shown to be
highly expressed in developing fibers at the onset of
secondary wall cellulose synthesis. Comparisons indicate that
these genes and the rice CelA gene encode polypeptides that
have three regions of reasonably high homology, both in terms
of primary amino acid sequence and hydropathy, with bacterial
CelA proteins. The fact that these homologous stretches are
in the same sequential order as in the bacterial CelA proteins
and also contain four sub-regions previously predicted to be
critical for substrate binding and catalysis (12) argues that
the plant genes encode true homologs of bacterial CelA
proteins. Furthermore, the pattern of expression in fiber as
well as our demonstration that at least one of these
highly-conserved regions is critical for UDP-glc binding also
supports this conclusion.

Novel DNA promoter sequences are also supplied, and
methods for their use are described for directing
transcription of a gene of interest in cotton fiber.

The developing cotton fiber is an excellent system for
studies on cellulose synthesis as these single cells develop
synchronously in the boll and, at the end of elongation,
initiate the synthesis of a nearly pure cellulosic cell wall.
During this transition period, synthesis of other cell wall
polymers ceases and the rate of cellulose synthesis is
estimated to rise nearly 100-fold in vivo (13). In our






WO 98/18949 



PCT/US97/19529



continuing efforts to identify genes critical to this phase of 
fiber development, we have initiated a program sequencing 
randomly selected cDNA clones derived from a library prepared 
from mRNA harvested from fibers at the stage in which 
secondary wall synthesis approaches its maximum rate 
(approximately 21 dpa).

We have characterized two cotton (Gossypium hirsutum)
cDNA clones that are homologs of the bacterial CelA genes that
encode the catalytic subunit of cellulose synthase. Three
regions in the deduced amino acid sequences of the plant CelA
gene products are conserved with respect to the proteins
encoded by bacterial CelA genes. Within these conserved
regions are four highly conserved subdomains previously
suggested to be critical for catalysis and/or binding of the
substrate UDP-glc. An overexpressed DNA segment of the cotton
CelA1 gene encodes a polypeptide fragment that spans these
domains and effectively binds UDP-glc, while a similar fragment
having one of these domains deleted does not. The plant CelA
genes show little homology at the amino and carboxy terminal
regions and also contain two internal insertions of sequence,
one conserved and one hypervariable, that are not found in the
bacterial gene sequences. Cotton CelA1 and CelA2 genes are
expressed at high levels during active secondary wall
cellulose synthesis in the developing fiber. Genomic Southern
analyses in cotton demonstrate that CelA comprises a family of
approximately four distinct genes.

We report here the discovery of two cotton genes that
show highly-enhanced expression at the time of onset of
secondary wall synthesis in the fiber. The sequences of these
two cDNA clones, termed CelA1 and CelA2, while not identical,
are highly homologous to each other and to a sequenced rice
EST clone discovered in the dBEST databank. The deduced
proteins also share significant regions of homology with the
bacterial CelA proteins. Coupled with their high level and
specificity of expression in fiber at the time of active
cellulose synthesis, as well as the ability of an E. coli
expressed fragment of the CelA1 gene product to bind UDP-glc,




these findings support the conclusion that these plant genes
are true homologs of the bacterial CelA genes.

The methods of the present invention include transfecting
a host plant cell of interest with a transcription or
expression cassette comprising a cotton fiber promoter and
generating a plant which is grown to produce fiber having the
desired phenotype. Constructs and methods of the subject
invention thus find use in modulation of endogenous fiber
products, as well as production of exogenous products and in
modifying the phenotype of fiber and fiber products. The
constructs also find use as molecular probes. In particular,
constructs and methods for use in gene expression in cotton
embryo tissues are considered herein. By these methods, novel
cotton plants and cotton plant parts, such as modified cotton
fibers, may be obtained.

The sequences and constructs of this invention may also
be used to isolate related cellulose synthase genes from
forest tree species, for use in transforming and modifying
wood quality. As an example, lignin, an undesirable
by-product of the pulping process, may be reduced by
over-expressing the cellulose synthase product and diverting
production into cellulose.

Thus, the application provides constructs and methods of 
use relating to modification of cell and cell wall phenotype 
in cotton fiber and wood products. 

DESCRIPTION OF THE DRAWINGS 

Figure 1. Northern analysis of CelA1 gene in cotton
tissues and developing fiber. Approximately 10 μg total RNA
from each tissue was loaded per lane. Blots were prepared, and
probe preparation and hybridization were performed, as
described previously (14). The entire CelA1 cDNA insert
was used as a probe in this experiment. Exposure time for
the autoradiogram was seven hours at -70°C.

Figure 2. Cotton genomic DNA analysis for both the
CelA1 and CelA2 cDNAs. Approximately 10-12 μg of DNA was
digested with the designated restriction enzymes and
electrophoresed in 0.9% agarose gels. Probe preparation and
hybridization conditions were as described previously (14).
The entire CelA1 and CelA2 cDNAs were utilized as probes.
Exposure time for the autoradiograms was three days at -70°C.

Figure 3. Multiple alignment of deduced amino acid
sequences of plant and bacterial CelA proteins. Analyses were 
performed by Clustal Analysis using the Lasergene Multalign 
program (DNAStar, Madison, WI) with gap and gap-length 
penalties of 10 and a PAM250 weight table. Residues are boxed 
and shaded when they show chemical group similarity in 4 out 



where homology between plant and bacterial proteins is
highest. The plant proteins show two insertions that are not
present in the bacterial protein: one, P-CR, is conserved
among the plant CelA genes, while a second insertion is
hypervariable (HVR) between plant genes. The presence of the
P-CR and HVR regions led to inaccurate alignments when the
entire proteins were compared; the optimal alignments shown
here were thus performed in five separate blocks. Regions
U-1 through U-4 are predicted to be critical for UDP-glc
binding and catalysis in bacterial CelA proteins; the
predicted critical D residues and QXXRW motif are boxed and
starred respectively. Potential sites of N-glycosylation are
indicated by -G-.

Figure 4. Kyte-Doolittle hydropathy plots of cotton
CelA1 aligned with those of two bacterial CelA proteins.
Alignments and designations are based upon those noted in Fig.
3. The hydropathy profiles shown were calculated using a
window of 7, although a window of 19 was used for predictions
of transmembrane helices that are indicated by the arrows.
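
For readers unfamiliar with this kind of plot, a sliding-window
hydropathy profile can be computed as in the following Python sketch.
It is a minimal illustration, not the software used to prepare the
figure: the function name is hypothetical and the example sequence is
an invented toy string. The scale values are the published
Kyte-Doolittle hydropathy indices.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list[float]:
    """Mean Kyte-Doolittle score of each full window along the sequence.

    Element i of the result covers residues i .. i + window - 1
    (0-based). Raises KeyError for non-standard residue letters.
    """
    scores = [KD[aa] for aa in protein.upper()]
    if window < 1 or window > len(scores):
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Invented toy sequence: a hydrophobic stretch followed by a charged one.
toy = "LLLLLLLDDDDDDD"
profile = hydropathy_profile(toy, window=7)
print([round(v, 2) for v in profile])
```

Calling the same function with window=19 gives the smoother profile
of the kind the caption describes for transmembrane helix prediction.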

Figure 5. An E. coli expressed GST-cotton CelA1 fusion
protein containing U-1 through U-4 binds UDP-glc in
vitro. Panel A shows a hypothetical orientation of the cotton
CelA1 protein in the plasma membrane and indicates the
cytoplasmic region containing the sub-domains U-1 to U-4.
GST-fusion constructs for CelA1 fragments spanning the region
between the potential transmembrane helices (A through H) were
prepared as described in Materials and Methods. The purified
and blotted CelA1 fusion protein fragments were tested as
described in Materials and Methods for their ability to bind
32P-UDP-glc (panel B). M refers to the molecular weight
markers, while CS and ΔU1 refer to the full-length and deleted
GST-CelA1 fusion polypeptides. The left panel shows proteins
stained with Coomassie blue while the other three panels show
representative autoradiograms under different binding
conditions as described in Materials and Methods. Ph, BSA
and Ova refer to the molecular weight standards phosphorylase
b, bovine serum albumin and ovalbumin respectively.

Figure 6. Nucleic acid sequence of the cDNA encoding the CelA1
protein of cotton (Gossypium hirsutum).

Figure 7. Nucleic acid sequence of the cDNA encoding the CelA2
protein of cotton (Gossypium hirsutum), including
approximately the last 3' two-thirds of the encoding region.

Figure 8. Genomic nucleic acid sequence encoding the CelA1
protein of cotton (Gossypium hirsutum), including
approximately 900 bases of the promoter region 5' to the
encoding sequences.

DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the subject invention, novel
constructs and methods are described, which may be used to
provide for transcription of a nucleotide sequence of interest
in cells of a plant host, preferentially in cotton fiber cells,
to produce cotton fiber having an altered color phenotype.

Cotton fiber is a differentiated single epidermal cell of
the outer integument of the ovule. It has four distinct
growth phases: initiation, elongation (primary cell wall
synthesis), secondary cell wall synthesis, and maturation.
Initiation of fiber development appears to be triggered by
hormones. The primary cell wall is laid down during the
elongation phase, lasting up to 25 days postanthesis (DPA).
Synthesis of the secondary wall commences prior to the
cessation of the elongation phase and continues to
approximately 40 DPA, forming a wall of almost pure cellulose.

The constructs for use in such cells may include several
forms, depending upon the intended use of the construct.
Thus, the constructs include vectors, transcriptional
cassettes, expression cassettes and plasmids. The
transcriptional and translational initiation region (also
sometimes referred to as a "promoter") preferably comprises
a transcriptional initiation regulatory region and a
translational initiation regulatory region of untranslated 5'
sequences, "ribosome binding sites," responsible for binding
mRNA to ribosomes and translational initiation. It is
preferred that all of the transcriptional and translational
functional elements of the initiation control region are




sequences, such as enhancers, or deletions of nonessential
and/or undesired sequences. By "obtainable" is intended a
promoter having a DNA sequence sufficiently similar to that of
a native promoter to provide for the desired specificity of
transcription of a DNA sequence of interest. It includes
natural and synthetic sequences as well as sequences which may
be a combination of synthetic and natural sequences.

Cotton fiber transcriptional initiation regions of
cellulose synthase are used in cotton fiber modification.

A transcriptional cassette for transcription of a
nucleotide sequence of interest in cotton fiber will include,
in the direction of transcription, the cotton fiber
transcriptional initiation region, a DNA sequence of interest,
and a transcriptional termination region functional in the
plant cell. When the cassette provides for the transcription
and translation of a DNA sequence of interest it is considered
an expression cassette. One or more introns may also be
present.

Other sequences may also be present, including those
encoding transit peptides and secretory leader sequences as
desired.

Downstream from, and under the regulatory control of, the
cellulose synthase transcriptional/translational initiation
control region is a nucleotide sequence of interest which
provides for modification of the phenotype of fiber. The
nucleotide sequence may be any open reading frame encoding a
polypeptide of interest, for example, an enzyme, or a sequence
complementary to a genomic sequence, where the genomic
sequence may be an open reading frame, an intron, a noncoding
leader sequence, or any other sequence where the complementary
sequence inhibits transcription, messenger RNA processing, for
example, splicing, or translation. The nucleotide sequences
of this invention may be synthetic, naturally derived, or
combinations thereof. Depending upon the nature of the DNA
sequence of interest, it may be desirable to synthesize the
sequence with plant preferred codons. The plant preferred
codons may be determined from the codons of highest frequency
in the proteins expressed in the largest amount in the
particular plant species of interest. Phenotypic modification
can be achieved by modulating production either of an
endogenous transcription or translation product, for example
as to the amount, relative distribution, or the like, or an
exogenous transcription or translation product, for example to
provide for a novel function or products in a transgenic host
cell or tissue. Of particular interest are DNA sequences
encoding expression products associated with the development
of plant fiber, including genes involved in metabolism of
cytokinins, auxins, ethylene, abscisic acid, and the like.
Methods and compositions for modulating cytokinin expression
are described in United States Patent No. 5,177,307, which
disclosure is hereby incorporated by reference.
Alternatively, various genes, from sources including other
eukaryotic or prokaryotic cells, including bacteria, such as
those from Agrobacterium tumefaciens T-DNA auxin and cytokinin
biosynthetic gene products, for example, and mammals, for
example interferons, may be used.
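
The determination of plant preferred codons described above can be
sketched in Python as follows. This is a hedged illustration under
stated assumptions: the function names are hypothetical, the input
coding sequences are invented toy strings standing in for highly
expressed genes of the species of interest, and the standard genetic
code is assumed.

```python
from collections import Counter, defaultdict

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ("*" = stop).
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def preferred_codons(coding_sequences):
    """Return the most frequent codon for each amino acid in the given ORFs."""
    counts = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper().replace("U", "T")
        # Read in frame from the first base; ignore a trailing partial codon.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            aa = CODON_TABLE.get(cds[i:i + 3])
            if aa and aa != "*":
                counts[aa][cds[i:i + 3]] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


def back_translate(protein, preferred):
    """Encode a protein using the preferred codon for each residue.

    Raises KeyError if a residue never occurred in the reference ORFs.
    """
    return "".join(preferred[aa] for aa in protein.upper())


# Invented toy ORFs standing in for highly expressed genes.
toy_orfs = ["ATGGCTGCTGCCAAGTAA", "ATGGCTAAGAAATAA"]
table = preferred_codons(toy_orfs)
print(back_translate("MAK", table))
```

In practice the reference set would be weighted toward the most
abundantly expressed proteins of the species, as the text indicates;
the sketch simply counts every supplied sequence equally.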

Alternatively, the present invention provides the
sequences to cotton cellulose synthase, which can be
expressed, or down regulated by antisense or co-suppression,
with its own, or other cotton or other fiber promoters to
modify fiber phenotype.

In cotton, primary wall hemicellulose synthesis ceases as
secondary wall synthesis initiates in the fiber, and there are
only two possible β-glucans synthesized in fibers at the time
these genes are highly-expressed: callose and cellulose (20).
The following data strongly argue against the plant CelA genes
coding for callose synthase: 1) callose synthase binds UDP-glc
and is activated in a Ca2+-dependent manner (2), while the
CelA1 polypeptide fragment containing the UDP-glc binding site
preferentially binds UDP-glc in a Mg2+-dependent manner,
similar to bacterial cellulose synthase (9); 2) the timing
of synthesis of callose in vivo in developing cotton fiber
(20) does not match the expression of the cotton CelA genes
(Fig. 1); 3) comparison of the CelA gene sequences with those
of suspected 1,3-β-glucan synthase genes from yeast (21)




It is still possible that the CelA protein might possess
both activities, as hypothesized some years ago (22-23), and
the plant CelAs might be responsible for direct polymerization
of glucan from UDP-glc as proposed for A. xylinum, although
they may catalyze synthesis of a lipid-glc precursor as
proposed for the CelA protein of A. tumefaciens.

In addition to their similarities, the plant CelA genes
show several very interesting divergences from their bacterial
ancestors, and these may account for the previous lack of
success in using bacterial probes to detect these cDNA clones.
However, a BLAST search of protein data banks (24) using the
entire protein sequence of cotton CelA1 always shows highest
homology with the bacterial cellulose synthases. Of
particular interest is the insertion of two unique,
plant-specific regions designated P-CR and HVR. These
regions are clearly not artifacts of cloning as they are
observed in both cotton genes as well as the rice CelA gene.
The three plant proteins show a high degree of amino acid
homology to each other throughout most of their length,
diverging only at the N- and C-terminal ends and the very
interesting HVR region. It is tempting to speculate that the
HVR region may confer some specificity of function; the
highly-charged and cysteine rich nature of the first portion
of HVR could make this region a potential candidate for
interaction with specific regulatory proteins, with
cytoskeletal elements, or for redox regulation. In addition,
we note the presence of several cysteine residues near the N-
and C-terminal regions of the protein that might serve as
substrates for palmitoylation and also serve to help anchor
the protein in the membrane (25).

In summary, the finding of these plant CelA homologs
potentially opens up an exciting chapter in research on
cellulose synthesis in higher plants. Their finding is of
particular significance since biochemical approaches to
identification of plant cellulose synthase have proven
exceedingly difficult. One obvious challenge will be to gain
definitive proof that these genes are truly functional in
cellulose synthesis in vivo. Other promising goals will be to
identify other components of a complex that might interact
with CelA, such as that proposed for sucrose synthase (26),
and/or a regulatory subunit that binds cyclic-di-GMP (9,27) or
other glycosyl transferases (10,11).

Transcriptional cassettes may be used when the
transcription of an anti-sense sequence is desired. When the
expression of a polypeptide is desired, expression cassettes
providing for transcription and translation of the DNA
sequence of interest will be used. Various changes are of
interest; these changes may include modulation (increase or
decrease) of formation of particular saccharides, hormones,
enzymes, or other biological parameters. These also include
modifying the composition of the final fiber, that is, changing
the ratio and/or amounts of water, solids, fiber or sugars.
Other phenotypic properties of interest for modification
include response to stress, organisms, herbicides, brushing,
growth regulators, and the like. These results can be
achieved by providing for reduction of expression of one or
more endogenous products, particularly an enzyme or cofactor,
either by producing a transcription product which is
complementary (anti-sense) to the transcription product of a
native gene, so as to inhibit the maturation and/or expression
of the transcription product, or by providing for expression
of a gene, either endogenous or exogenous, to be associated
with the development of a plant fiber.
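
An anti-sense transcript is obtained by placing the sequence of
interest in reverse orientation relative to the promoter, so that the
strand read by the polymerase yields an RNA complementary to the
native message. The Python sketch below shows the underlying
reverse-complement operation; the function name is hypothetical and
the fragment is an invented toy string, not part of any CelA sequence.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return dna.translate(_COMPLEMENT)[::-1]


# Invented toy sense-strand fragment for illustration only.
sense = "ATGGCC"
antisense_insert = reverse_complement(sense)
print(antisense_insert)
```

Applying the function twice returns the original sequence, which is a
convenient sanity check when preparing insert sequences.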

The termination region which is employed in the
expression cassette will be primarily one of convenience,
since the termination regions appear to be relatively
interchangeable. The termination region may be native with
the transcriptional initiation region, may be native with the
DNA sequence of interest, or may be derived from another source.
The termination region may be naturally occurring, or wholly
or partially synthetic. Convenient termination regions are
available from the Ti-plasmid of A. tumefaciens, such as the
octopine synthase and nopaline synthase termination regions.
In some embodiments, it may be desired to use the 3'




As described herein, in some instances additional
nucleotide sequences will be present in the constructs to
provide for targeting of a particular gene product to specific
cellular locations.

Similarly, other constitutive promoters may also be
useful in certain applications, for example the mas, Mac or
DoubleMac promoters described in United States Patent No.
5,106,739 and by Comai et al., Plant Mol. Biol. (1990)
15:373-381. When plants comprising multiple gene constructs are
desired, the plants may be obtained by co-transformation with
both constructs, or by transformation with individual
constructs followed by plant breeding methods to obtain plants
expressing both of the desired genes.

A variety of techniques are available and known to
those skilled in the art for introduction of constructs into a
plant cell host. These techniques include transfection with
DNA employing A. tumefaciens or A. rhizogenes as the
transfecting agent, protoplast fusion, injection,
electroporation, particle acceleration, etc. For
transformation with Agrobacterium, plasmids can be prepared in
E. coli which contain DNA homologous with the Ti-plasmid,
particularly T-DNA. The plasmid may or may not be capable of
replication in Agrobacterium, that is, it may or may not have
a broad spectrum prokaryotic replication system such as does,
for example, pRK290, depending in part upon whether the
transcription cassette is to be integrated into the Ti-plasmid
or to be retained on an independent plasmid. The
Agrobacterium host will contain a plasmid having the vir genes
necessary for transfer of the T-DNA to the plant cell and may
or may not have the complete T-DNA. At least the right border
and frequently both the right and left borders of the T-DNA of
the Ti- or Ri-plasmids will be joined as flanking regions to
the transcription construct. The use of T-DNA for
transformation of plant cells has received extensive study and
is amply described in EPA Serial No. 120,516; Hoekema, In: The
Binary Plant Vector System, Offset-drukkerij Kanters B.V.,
Alblasserdam, 1985, Chapter V; Knauf, et al., Genetic Analysis
of Host Range Expression by Agrobacterium, In: Molecular
Genetics of the Bacteria-Plant Interaction, Puhler, A. ed.,
Springer-Verlag, NY, 1983, p. 245; and An, et al., EMBO J.
(1985) 4:277-284.

For infection, particle acceleration and electroporation,
a disarmed Ti-plasmid (lacking particularly the tumor genes
found in the T-DNA region) may be introduced into the plant
cell. By means of a helper plasmid, the construct may be
transferred to the A. tumefaciens and the resulting
transfected organism used for transfecting a plant cell;
explants may be cultivated with transformed A. tumefaciens or
A. rhizogenes to allow for transfer of the transcription
cassette to the plant cells. Alternatively, to enhance
integration into the plant genome, terminal repeats of
transposons may be used as borders in conjunction with a
transposase. In this situation, expression of the transposase
should be inducible, so that once the transcription construct
is integrated into the genome, it should be relatively stably
integrated. Transgenic plant cells are then placed in an
appropriate selective medium for selection of transgenic cells,
which are then grown to callus, shoots grown and plantlets
generated from the shoot by growing in rooting medium.

To confirm the presence of the transgenes in transgenic
cells and plants, a Southern blot analysis can be performed
using methods known to those skilled in the art. Expression
products of the transgenes can be detected in any of a variety
of ways, depending upon the nature of the product, and include
immune assay, enzyme assay or visual inspection, for example
to detect pigment formation in the appropriate plant part or




cells. Once transgenic plants have been obtained, they may be
grown to produce fiber having the desired phenotype. The
fibers may be harvested, and/or the seed collected. The seed
may serve as a source for growing additional plants having the
desired characteristics. The terms transgenic plants and
transgenic cells include plants and cells derived from either
transgenic plants or transgenic cells.

The various sequences provided herein may be used as 



may be useful in the present invention, for example, to obtain
related transcriptional initiation regions from the same or
different plant sources. Related transcriptional initiation
regions obtainable from the sequences provided in this
invention will show at least about 60% homology, and more
preferred regions will demonstrate an even greater percentage
of homology with the probes.
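
The 60% threshold above can be illustrated with a short sketch. It assumes, purely for illustration, that "homology" is scored as percent identity over two sequences that have already been aligned (gaps written as "-"); the specification does not define the scoring method, and the function names and toy sequences below are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # columns containing a gap are not scored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def is_related(aligned_a: str, aligned_b: str, threshold: float = 60.0) -> bool:
    """True when the pair meets the 'at least about 60%' criterion (assumed scoring)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences, not taken from the figures.
    print(percent_identity("ATGCATGCAT", "ATGCATGGTT"))
    print(is_related("AAAAA", "AATTT"))
```

In practice relatedness to a probe is usually judged by hybridization or by an alignment program; the sketch only makes the arithmetic of the percentage explicit.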

Of particular importance is the ability to obtain related
transcription initiation control regions having the timing and
tissue parameters described herein. Thus, by employing the
techniques described in this application, and other techniques
known in the art (such as Maniatis, et al., Molecular
Cloning: A Laboratory Manual (Cold Spring Harbor, New York)
1982), other encoding regions or transcription initiation
regions of cellulose synthase as described in this invention
may be determined. The constructs can also be used in
conjunction with plant regeneration systems to obtain plant
cells and plants; thus, the constructs may be used to modify
the phenotype of fiber cells, to provide cotton fibers which
are colored as the result of genetic engineering to heretofore
unavailable hues and/or intensities.

Various varieties and lines of cotton may find use in the
described methods. Cultivated cotton species include
Gossypium hirsutum and G. barbadense (extra-long staple, or
Pima cotton), which evolved in the New World, and the Old
World crops G. herbaceum and G. arboreum.

By using encoding sequences to enzymes which control wood
quality and wood product characteristics, i.e., cellulose
synthase and O-methyl transferase (a key enzyme in lignin
biosynthesis), the relative synthesis of cellulose and lignin
by plants may be controlled. This is achieved by transformation
of the plant genome with a recombinant gene construct which
contains the gene specifying an enzyme critical to the synthesis of
cellulose or lignin or a lignin precursor, in either a sense
or an antisense orientation. If in an antisense orientation,
the gene will be transcribed so that mRNA having a sequence
complementary to the equivalent mRNA transcribed from the
endogenous gene is expressed, leading to suppression of the
synthesis of lignin or cellulose.
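
The relationship between a sense insert and its antisense transcript can be sketched in a few lines. This is a minimal illustration only: the 12-base fragment is a made-up toy sequence, and the helper names are hypothetical rather than part of the described methods.

```python
# Complement table for DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA strand."""
    return dna.translate(_COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """mRNA reads like the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def antisense_transcript(coding_strand: str) -> str:
    """RNA made when the insert is placed in antisense orientation."""
    return transcribe(reverse_complement(coding_strand))


def is_complementary(rna_a: str, rna_b: str) -> bool:
    """True when rna_b base-pairs with rna_a along its full length (antiparallel)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return len(rna_a) == len(rna_b) and all(
        pairs[x] == y for x, y in zip(rna_a, reversed(rna_b))
    )


if __name__ == "__main__":
    fragment = "ATGGCTGAAGAC"  # toy coding-strand fragment (hypothetical)
    sense = transcribe(fragment)
    antisense = antisense_transcript(fragment)
    print(sense, antisense, is_complementary(sense, antisense))
```

The antisense RNA pairs with the endogenous message along its length, which is the property the suppression approach relies on.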

If the recombinant gene has the lignin enzyme gene in 
normal, or "sense" orientation, increased production of the 
enzyme may occur when the insert is the full length DNA but 
suppression may occur if only a partial sequence is employed. 

Furthermore, the expression of one may be increased in
this manner while the other is reduced. For instance, the
production of cellulose may be increased through the
overexpression of cellulose synthase, while lignin production
is reduced. By thus reducing the relative lignin content, the
quality of wood for paper production would be improved.



EXAMPLES 

The following examples are offered by way of illustration 
and not by limitation. 
Example 1

cDNA Libraries
An unamplified cDNA library was prepared in the
Lambda Uni-Zap vector (Stratagene, La Jolla, CA) using cDNA
derived from polyA+ mRNA prepared from fibers of Gossypium
hirsutum Acala SJ-2 harvested at 21 DPA, the time at which
secondary wall cellulose synthesis is approaching a maximal
rate (13). Approximately 250 plaques were randomly selected
from the cDNA library, phages purified and plasmids excised
from the phage vector and transformed.
The resulting clones/inserts were size screened on 0.8%
agarose gels (DNA inserts below 600 bp were excluded).



Example 2
Isolation and Sequencing of cDNA Clones
Plasmid DNA inserts were randomly sequenced using an
Applied Biosystems (Foster City, CA) Model 373A DNA sequencer.
A search of the GenBank EST databank revealed that there were
at least 23 rice and 8 Arabidopsis EST clones that contain
sequences similar to the cotton CelA1 DNA sequence. EST clone
S14965 was obtained from Y. Nagamura (Rice Genome Research
Program, Tsukuba). A series of deletion mutants were



Example 3 

Northern and Southern Analyses. 
Cotton plants (G. hirsutum cv. Coker 130) were grown in
the greenhouse and tissues harvested at the appropriate times
indicated and frozen in liquid N2. Total cotton RNA and
cotton genomic DNA were prepared and subjected to Northern and
Southern analyses as described previously (14).

Example 4 
UDP-Glc Binding Studies 

To construct a GST-CelA1 protein fusion, a 1.6 kb
CelA1 DNA fragment containing a putative cytoplasmic domain
between the second and third transmembrane helices was PCR
amplified with the primers ATTGAATTCCTGGGTGTTGGATCAGTT and
ATTCTCGAGTGGAAGGGATTGAAA in a reaction containing 1 ng plasmid
DNA (clone 213) as template. The amplified fragment was
unidirectionally cloned into the EcoRI and XhoI sites of the
GST expression vector pGEX4T-3 (Pharmacia), generating a
fusion protein GST-CS containing the amino acids Ser215 to
Leu759 of the cotton CelA1 protein. Two CelA1 gene internal
PstI sites within the plasmid pGST-CS were used to generate
the deletion mutant pGST-CSΔU1, which lacks 196 amino acids
(and the U1 binding region) from Val252 to Ala447.
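
The cloning design above can be checked with simple string and integer arithmetic. The sketch below uses the two primer sequences and residue numbers quoted in the text, together with the standard EcoRI (GAATTC) and XhoI (CTCGAG) recognition sequences; the function names are hypothetical.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
XHOI_SITE = "CTCGAG"   # XhoI recognition sequence

FORWARD_PRIMER = "ATTGAATTCCTGGGTGTTGGATCAGTT"
REVERSE_PRIMER = "ATTCTCGAGTGGAAGGGATTGAAA"


def site_offset(primer: str, site: str) -> int:
    """0-based offset of a restriction site in a primer, or -1 if absent."""
    return primer.find(site)


def residues_spanned(first: int, last: int) -> int:
    """Number of residues in an inclusive range such as Ser215 to Leu759."""
    return last - first + 1


fusion_residues = residues_spanned(215, 759)   # CelA1 portion of GST-CS
deleted_residues = residues_spanned(252, 447)  # removed in the PstI deletion
coding_bp = fusion_residues * 3                # coding length of the insert

if __name__ == "__main__":
    print(site_offset(FORWARD_PRIMER, ECORI_SITE), site_offset(REVERSE_PRIMER, XHOI_SITE))
    print(fusion_residues, deleted_residues, coding_bp)
```

Each primer carries its site after a three-base 5' extension, which is what allows directional cloning into the EcoRI and XhoI sites. The 545 residues correspond to 1635 bp of coding sequence, consistent with the stated 1.6 kb fragment, and the deletion count matches the stated 196 amino acids.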

For the UDP-glc binding assays, α-32P-labeled UDP-glc was
prepared as described (15). The two fusion proteins GST-CS
and GST-CSΔU1 were expressed in E. coli and purified from
inclusion bodies (16). Proteins were suspended in sample
buffer, heated to 100°C for 5 min and approximately 50 ng of
the two fusion protein products and molecular weight standards
(Bio-Rad) subjected to SDS-PAGE using 4.5% and 7.5% acrylamide
in the stacking and separating gels, respectively (17). After
electrophoresis, protein transfer to nitrocellulose filters
was carried out in transfer buffer (25 mM Tris, 192 mM glycine
and 20% (v/v) methanol). The filter was briefly rinsed in
deionized H2O and incubated in PBS buffer for 15 min, then
stained with Ponceau-S in PBS buffer. After washing in
deionized H2O, protein was further renatured on the filter by
incubation in PBS buffer for 30 min and used directly for
binding assays. All binding buffers contained 50 mM HEPES/KOH
(pH 7.3), 50 mM NaCl and 1 mM DTT. In addition, binding buffers
contained either 5 mM MgCl2 and 5 mM EGTA (Buffer Mg/EGTA), 5 mM
EDTA (Buffer EDTA) or 1 mM CaCl2 and 20 mM cellobiose (Buffer
Ca/CB). The binding reaction was carried out in 7 ml containing
32P-labeled UDP-glc (1 x 10^7 cpm) at room temperature for 3
hours with constant shaking. Filters were washed separately
three times in 20 ml washing buffer consisting of 50 mM
HEPES/KOH (pH 7.3) and 50 mM NaCl for 5 min each, briefly dried
and analyzed on a Bio-imaging analyzer BAS1000 (Fuji).
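
For bookkeeping, the three binding buffers can be written as a small lookup table. The concentrations are those stated above; the dictionary layout and helper names are illustrative assumptions, not part of the protocol.

```python
# Components shared by every binding buffer, in mM.
COMMON_MM = {"HEPES/KOH pH 7.3": 50.0, "NaCl": 50.0, "DTT": 1.0}

# Buffer-specific additions, in mM.
ADDITIVES_MM = {
    "Mg/EGTA": {"MgCl2": 5.0, "EGTA": 5.0},
    "EDTA": {"EDTA": 5.0},
    "Ca/CB": {"CaCl2": 1.0, "cellobiose": 20.0},
}


def binding_buffer(name: str) -> dict:
    """Full composition (mM) of one of the named binding buffers."""
    return {**COMMON_MM, **ADDITIVES_MM[name]}


def micromoles_needed(conc_mm: float, volume_ml: float) -> float:
    """Amount of a component in micromoles: mM x mL = micromol."""
    return conc_mm * volume_ml


if __name__ == "__main__":
    for name in ADDITIVES_MM:
        print(name, binding_buffer(name))
    # MgCl2 required for one 7 ml binding reaction in Buffer Mg/EGTA
    print(micromoles_needed(binding_buffer("Mg/EGTA")["MgCl2"], 7.0))
```

Keeping the shared components separate from the additives makes it easy to see that the three conditions differ only in divalent cation, chelator and cellobiose.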

Example 5

Identification, Differential Expression and
Genomic Analysis of Cotton CelA Genes

During the course of screening and sequencing random cDNA
clones from a cotton fiber specific cDNA library prepared from
RNA collected approximately 21 dpa, two cDNA clones were
discovered that initially exhibited small blocks of amino
acid homology to the proteins encoded by the bacterial CelA
genes. Clone 213 appeared to be a full-length cDNA while
another distinct clone, 207, appeared to be a partial clone
relative to the length of 213. These two clones were
partially homologous at the nucleotide and amino acid levels
and designated CelA1 and CelA2 respectively.

These clones were then utilized as probes for Northern
blot analysis to determine their differential expression in
cotton tissues and developing cotton fiber. Figure 1
indicates the expression pattern for the CelA1 gene. The
CelA1 gene encodes an mRNA of approximately 3.2 kb in length and
is expressed at extremely high levels in developing fiber,
beginning at approximately 17 dpa, the time at which secondary
wall cellulose synthesis is initiated (13). The gene is also
expressed at low levels in all other cotton tissues, most
notably in root, flower and developing seeds. Since regions
of these genes are somewhat homologous at the nucleotide




hypervariable regions described in Fig T) to distinguish
specific expression patterns of CelA1 and CelA2. These gene
specific probes generated expression patterns (data not shown)
for the two genes identical to that shown in Figure 1, except
that a very low mRNA level was also detected in the primary
wall phase of fiber development (5-14 dpa) for the CelA2 gene
when the blots were overexposed. The CelA2 gene specific
probe also detected a 3.2 kb mRNA, analogous in size to the mRNA
specified by the gene for CelA1. Messenger RNAs for both
genes exhibit a characteristic degradation pattern similar to
other mRNAs specifically expressed late in fiber development
(J. Pear, unpublished observations) and this degradation is
not a result of the integrity of the mRNA preparations (14).
We estimate that both cotton CelA genes are expressed in
developing fiber at approximately 500 times their level of
expression in other cotton tissues and that they constitute
approximately 1-2% of the 24 dpa fiber mRNA.

In order to estimate the number of CelA genes in the
cotton genome, Southern analysis was performed utilizing both
CelA cDNAs independently as probes (Fig. 2). Although the two
cotton genes are fairly non-homologous at the nucleotide level
over their entire length, there are regions of homology (the
H1, H2 and H3 regions described below) and it was thought
these regions could be useful in identifying other cotton CelA
genes. Figure 2 indicates that the CelA1 cDNA probe will
hybridize, albeit weakly, to the CelA2 genomic equivalent and
vice versa. The HindIII pattern for both genes and cDNA
probes is particularly discriminating. There are also a
number of other weakly hybridizing bands in these digests and
from these data we estimate that the cotton CelA genes
constitute a small family of approximately four genes.
Homology of Plant and Bacterial CelA Gene Products. 

In addition to the two similar cotton CelA genes, a
homologous cDNA clone was discovered in the dbEST databank* of
rice and Arabidopsis ESTs. The rice clone having the longest
insert (Accession No. D48636) was obtained and sequenced,
and the homology comparisons with bacterial proteins reported
here also include results with the rice CelA. Figure 3 shows
the results of a multiple alignment of the deduced amino acid
sequences from the three plant CelA genes and four bacterial
CelA genes from A. xylinum (AcsAB and BcsA), E. coli, and A.
tumefaciens. Figure 4 shows hydropathy plots (18) of cotton
CelA1 similarly aligned with two bacterial CelA proteins and
serves as a more general summary of the overall homologies.

Of the plant genes, only the cotton CelA1 appears to be a
full-length clone of 3.2 kb exhibiting an open reading frame
that could potentially code for a polypeptide of 109,586 Da, a
pI of 6.4, and four potential sites of N-glycosylation.
Comparison of the N-terminal region of cotton CelA1 with
bacterial genes indicates that the plant protein has an
extended N-terminus similar in length and hydropathy profile,
but with only poor amino acid sequence homology, to the A.
tumefaciens CelA protein. In general, sequence homology of
plant and bacterial genes in both the N-terminal and
C-terminal regions is poor. However, although overall
similarity comparing plant to bacterial proteins is less than
25%, three homologous regions were identified, called H-1,
H-2, and H-3, where the sequence similarity rises to 50-60% at
the amino acid level. Interspersed between these regions of
homology are two plant-specific regions not found at all in
the bacterial proteins. Sequences in the first of these
insertions are highly conserved in the plant genes (P-CR),
while the second interspersed region seems to be a
hypervariable region (HVR), for there is considerable sequence
divergence among the plant proteins analyzed.

* The following accession numbers were identified as showing homology
with cotton CelA-1. For rice: D4863S, D41261, D40691, D46824,
D47622, D47175, 04X166, D41986, D24655, D23732, D24375, D47732,
D47821, D47850, D47494, D24964, D24862, D24860, D24711, D23841,
D48053, D48612, D40673; for Arabidopsis: T45303, T45414, H76149,
H36985, 230729, H36425, T45311, A35212.

None of the plant or bacterial CelA proteins contains
obvious signal sequences even though they are presumably
transmembrane proteins (4). However, the overall profiles
suggest two potential transmembrane helices in the N-terminal




could anchor the protein in the membrane (see arrows fig.
also panel A of Fig. 5). The amino acid sequence positions for
these predicted transmembrane helices are: A (169-187), B
(200-218), C (759-777), D (783-801), E (819-837), F (870-888),
G (903-921), H (933-951). The central portions of the
proteins are more hydrophilic and are predicted to reside in
the cytoplasm and contain the site(s) of catalysis. More
detailed inspection of these hydrophilic stretches reveals
four particularly conserved sub-regions (marked U-1 through
U-4 on Figs. 3-4) that contain the conserved Asp (D) residues
(in U-1 through U-3) and the motif QXXRW (in U-4) that have been
proposed (12) to be involved in substrate binding and/or
catalysis.
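
The helix coordinates and the QXXRW motif lend themselves to a quick consistency check. The sketch below takes the eight residue ranges from the text; the regular expression reads "X" as any single residue, and the short test peptide is a made-up string, not CelA1 sequence.

```python
import re

# Predicted transmembrane helices (1-based, inclusive residue ranges from the text).
HELICES = {
    "A": (169, 187), "B": (200, 218), "C": (759, 777), "D": (783, 801),
    "E": (819, 837), "F": (870, 888), "G": (903, 921), "H": (933, 951),
}

# QXXRW, with X taken to mean any one residue.
QXXRW = re.compile(r"Q..RW")


def helix_lengths(helices: dict) -> dict:
    """Length in residues of each predicted helix."""
    return {name: end - start + 1 for name, (start, end) in helices.items()}


def central_region(helices: dict) -> tuple:
    """Residues lying between helix B and helix C (the hydrophilic central portion)."""
    return helices["B"][1] + 1, helices["C"][0] - 1


def find_qxxrw(protein: str) -> list:
    """1-based start positions of every QXXRW match in a protein string."""
    return [m.start() + 1 for m in QXXRW.finditer(protein.upper())]


if __name__ == "__main__":
    print(helix_lengths(HELICES))
    print(central_region(HELICES))
    print(find_qxxrw("MKQAARWLLQVVRWG"))  # toy peptide (hypothetical)
```

Every predicted helix is 19 residues long, and the stretch between helices B and C (residues 219-758) is the region carried, with a few flanking residues, by the 215-759 GST fusion used in the binding studies.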

Binding of UDP-glucose. Further evidence that the proteins
encoded by these plant genes are CelA homologs comes from our
demonstration that the product of a DNA segment encoding the
central region of the cotton CelA1 protein, over-expressed in
E. coli, binds UDP-glc. We subcloned a 1.6 kb fragment of the
cotton CelA1 clone to create a hybrid gene that encodes GST
fused to amino acid residues 215-759 of the
CelA1 protein (Fig. 5a). This region spans U-1 through U-4,
which are suspected to be critical for UDP-glc binding. As a
control, another GST fusion was created using a 1.0 kb PstI
fragment that had the U-1 region deleted and might not be
predicted to bind UDP-glc. The fusion proteins were
overexpressed in E. coli, purified, and shown to have the
predicted sizes of approximately 87 and 64 kD, respectively
(Fig. 5b). The purified proteins were then subjected to
SDS-PAGE, and blotted to nitrocellulose. Blotted proteins
were renatured, and incubated with 32P-UDP-glc in order to
test for binding (Fig. 5b). As predicted, the 87 kD GST-CelA1
fusion does indeed bind UDP-glc in a Mg2+-dependent manner,
while the shorter fusion with the U-1 domain deleted did not
show any binding. (Although not observed in the experiment
shown, in some experiments very weak labeling in the presence
of Ca2+ could be observed.) As further controls, note that
the molecular weight standards BSA and ovalbumin, proteins
lacking UDP-glc binding sites, show no interaction with
UDP-glc, while phosphorylase b, an enzyme inhibited by UDP-glc
(19), binds this substrate.
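
The predicted sizes of the two fusion proteins can be roughly reproduced from residue counts. The sketch below rests on two assumptions that are not stated in the text: an average residue mass of about 110 Da and a GST moiety of about 26 kD. The function and constant names are hypothetical.

```python
AVG_RESIDUE_DA = 110.0  # assumption: typical average mass per amino acid residue
GST_TAG_KD = 26.0       # assumption: approximate mass of the GST moiety


def fusion_mass_kd(n_residues: int,
                   tag_kd: float = GST_TAG_KD,
                   avg_residue_da: float = AVG_RESIDUE_DA) -> float:
    """Rough mass (kD) of a GST fusion carrying n_residues of insert."""
    return tag_kd + n_residues * avg_residue_da / 1000.0


full_insert = 759 - 215 + 1   # residues 215-759 of CelA1
deleted = 447 - 252 + 1       # residues removed in the PstI deletion
short_insert = full_insert - deleted

gst_cs_kd = fusion_mass_kd(full_insert)
gst_cs_du1_kd = fusion_mass_kd(short_insert)

if __name__ == "__main__":
    print(full_insert, short_insert)
    print(round(gst_cs_kd, 1), round(gst_cs_du1_kd, 1))
```

Under these assumptions the estimates come out near 86 and 64 kD, in line with the approximately 87 and 64 kD bands reported for the full and deleted fusions.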

Figure 6 provides the encoding sequence to the cDNA to
celA1 (start ATG at approximately base 179), while Figure 7 provides the
encoding sequence to approximately the 3' two-thirds of the
cDNA to celA2.



Example 6 
Genomic DNA 

cDNA for the cellulose synthase clones was used to probe
for genomic clones. For both, full length genomic DNA was
obtained from a library made using the lambda dash 2 vector
from Stratagene, which was used to construct a genomic DNA
library from cotton variety Coker 130 (Gossypium hirsutum cv.
Coker 130), using DNA obtained from germinating seedlings.
The cotton genomic library was probed with a cellulose
synthase probe and genomic phage candidates were identified
and purified. Figure 8 provides an approximately 1 kb
sequence of the cellulose synthase promoter region which is
immediately 5' to the celA1 encoding region. The start of the
cellulose synthase enzyme encoding region is at the ATG at
base number 954.
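
Given a 1-based ATG position such as the base 954 quoted above, the upstream promoter portion can be separated from the coding region as follows. This is a sketch on a toy sequence; the Figure 8 sequence itself is not reproduced here and the function names are hypothetical.

```python
def split_at_start_codon(sequence: str, atg_base: int) -> tuple:
    """Split a sequence into (upstream, coding) at a 1-based ATG position.

    Raises ValueError if the stated position does not hold an ATG.
    """
    idx = atg_base - 1  # convert 1-based base number to 0-based index
    if sequence[idx:idx + 3].upper() != "ATG":
        raise ValueError(f"no ATG at base {atg_base}")
    return sequence[:idx], sequence[idx:]


def upstream_length(atg_base: int) -> int:
    """Number of bases lying 5' of the start codon."""
    return atg_base - 1


if __name__ == "__main__":
    toy = "TATAAATAGC" + "ATGGCT"  # toy promoter + toy coding start (hypothetical)
    promoter, coding = split_at_start_codon(toy, 11)
    print(promoter, coding)
    print(upstream_length(954))
```

With the ATG at base 954, 953 bases lie upstream, which agrees with the description of an approximately 1 kb promoter region.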



Example 7

Cotton Transformation

Explant Preparation

Promoter constructs comprising the cellulose synthase
promoter sequences of celA1 can be prepared. Coker 315
seeds are surface disinfected by placing in 50% Clorox (2.5%
sodium hypochlorite solution) for 20 minutes and rinsing 3
times in sterile distilled water. Following surface
sterilization, seeds are germinated in 25 x 150 sterile tubes
containing 25 mls 1/2 x MS salts: 1/2 x B5 vitamins: 1.5%
glucose: 0.3% gelrite. Seedlings are germinated in the dark
at 28°C for 7 days. On the seventh day seedlings are placed in
the light at 28±2°C.




binary plasmids pCGN2917 and pCGN2926 are transferred to 5 ml
of MG/L broth and grown overnight at 30°C. Bacteria cultures
are diluted to 1 x 10^8 cells/ml with MG/L just prior to
cocultivation. Hypocotyls are excised from eight day old
seedlings, cut into 0.5-0.7 cm sections and placed onto
tobacco feeder plates (Horsch et al. 1985). Feeder plates are
prepared one day before use by plating 1.0 ml tobacco
suspension culture onto a petri plate containing Callus
Initiation Medium (CIM) without antibiotics (MS salts: B5
vitamins: 3% glucose: 0.1 mg/L 2,4-D: 0.1 mg/L kinetin: 0.3%
gelrite, pH adjusted to 5.8 prior to autoclaving). A sterile
filter paper disc (Whatman #1) is placed on top of the feeder
cells prior to use. After all sections are prepared, each
section is dipped into an A. tumefaciens culture, blotted on
sterile paper towels and returned to the tobacco feeder
plates.
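
The dilution of the overnight culture to 1 x 10^8 cells/ml follows the usual C1·V1 = C2·V2 relationship. The sketch below assumes a measured overnight density; the 2 x 10^9 cells/ml figure and the 10 ml final volume are hypothetical values chosen for illustration, since the text does not state them.

```python
def dilution_volumes(stock_cells_per_ml: float,
                     target_cells_per_ml: float,
                     final_volume_ml: float) -> tuple:
    """Return (culture_ml, broth_ml) needed to reach the target density.

    Uses C1 * V1 = C2 * V2; raises ValueError if the stock is too dilute.
    """
    if target_cells_per_ml > stock_cells_per_ml:
        raise ValueError("stock culture is less dense than the target")
    culture_ml = target_cells_per_ml * final_volume_ml / stock_cells_per_ml
    return culture_ml, final_volume_ml - culture_ml


if __name__ == "__main__":
    # Hypothetical overnight density of 2e9 cells/ml, diluted into 10 ml of MG/L broth.
    culture_ml, broth_ml = dilution_volumes(2e9, 1e8, 10.0)
    print(culture_ml, broth_ml)
```

For the assumed values this calls for 0.5 ml of culture made up to 10 ml with MG/L broth; the actual volumes depend on the density measured for each overnight culture.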

Following two days of cocultivation on the feeder plates,
hypocotyl sections are placed on fresh Callus Initiation
Medium containing 75 mg/L kanamycin and 500 mg/L
carbenicillin. Tissue is incubated at 28±2°C, 30 uE 16:8
light:dark period for 4 weeks. At four weeks the entire
explant is transferred to fresh callus initiation medium
containing antibiotics. After two weeks on the second pass,
the callus is removed from the explants and split between
Callus Initiation Medium and Regeneration Medium (MS salts:
40 mM KNO3: 10 mM NH4Cl: B5 vitamins: 3% glucose: 0.3% gelrite: 400
mg/L carb: 75 mg/L kanamycin).

Embryogenic callus is identified 2-6 months following
initiation and is subcultured onto fresh regeneration medium.
Embryos are selected for germination, placed in static liquid
Embryo Pulsing Medium (Stewart and Hsu medium: 0.01 mg/L NAA:
0.01 mg/L kinetin: 0.2 mg/L GA3) and incubated overnight at
30°C. The embryos are blotted on paper towels and placed into
Magenta boxes containing 40 mls of Stewart and Hsu medium
solidified with Gelrite. Germinating embryos are maintained at
28±2°C, 50 uE m-2 s-1, 16:8 photoperiod. Rooted plantlets are
transferred to soil and established in the greenhouse.

Cotton growth conditions in growth chambers are as
follows: 16 hour photoperiod, temperature of approximately 80-85°,
light intensity of approximately 500 µEinsteins. Cotton
growth conditions in greenhouses are as follows: 14-16 hour
photoperiod with light intensity of at least 400 µEinsteins,
day temperature 90-95°F, night temperature 70-75°F, relative
humidity to approximately 80%.



Plant Analysis 

Flowers from greenhouse grown T1 plants are tagged at
anthesis in the greenhouse. Squares (cotton flower buds),
flowers, bolls etc. are harvested from these plants at various
stages of development and assayed for observable phenotype or
tested for enzyme activity.



Example 7 
Transformation of Tree Species

Numerous methods are known to the art for transforming
forest tree species; for example, U.S. Patent No. 5,654,190
discloses a process for producing transgenic plants belonging
to the genus Populus, the section Leuce.

The above results demonstrate how the cellulose synthase
cDNA may be used to alter the phenotype of a transgenic plant
cell, and how the promoter may be used to modify transgenic
cotton fiber cells.

All publications and patent applications cited in this
specification are herein incorporated by reference as if each
individual publication or patent application were specifically
and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in 
some detail, by way of illustration and example for purposes 
of clarity and understanding, it will be readily apparent to 



modifications may be made thereto, without departing from the
spirit or scope of the appended claims.

CLAIMS

What is claimed is: 

1. An isolated DNA encoding sequence to a plant
cellulose synthesis enzyme.

2. The DNA encoding sequence of Claim 1 wherein
said cellulose synthesis enzyme is cellulose synthase.

3. The DNA encoding sequence of Claim 2 wherein
said cellulose synthase is from cotton.

4. The DNA encoding sequence of Claim 3 wherein
said cotton cellulose synthase is celA1.

5. The DNA encoding sequence of Claim 4 wherein
said celA1 is encoded by the sequence of Figure 6.

6. The DNA encoding sequence of Claim 3 wherein
said cotton cellulose synthase is celA2.

7. The DNA encoding sequence of Claim 6 wherein
said celA2 is encoded by the sequence of Figure 7.

8. An isolated DNA encoding sequence to a plant
cellulose synthesis promoter region.

9. The promoter encoding sequence of Claim 8
wherein said cellulose synthesis promoter region is to
cellulose synthase.

10. The promoter sequence of Claim 9 wherein said
cellulose synthase promoter region is from cotton.

11. The promoter sequence of Claim 10 wherein said
cotton cellulose synthase promoter region is from celA1.

12. The promoter sequence of Claim 11 wherein said
cotton cellulose synthase promoter region is from the
sequence of Figure 8.

13. A recombinant DNA construct comprising any of
the DNA encoding sequences of Claims 1-10.

14. The DNA construct of Claim 13 comprising as
operably joined components in the direction of
transcription, a cotton fiber transcriptional factor and
the sequence of any of Claims 1-7.

15. A plant cell comprising a DNA construct of Claims 13
or 14.

16. A plant comprising a cell of Claim 15.

17. A method of modifying fiber phenotype in a
cotton plant, said method comprising:

transforming a plant cell with DNA comprising a
construct of Claims 13 or 14.

18. A method of modifying the wood quality
phenotype in a forest tree species, said method
comprising:



transforming a plant cell of said species with 




cellulose synthesis enzyme is cellulose synthase and
wherein the encoding sequence is in an antisense
orientation, wherein transcribed mRNA from said sequence
is complementary to the equivalent mRNA transcribed from
the endogenous gene, whereby the synthesis of cellulose
in said plant cell is suppressed.

20. A method according to Claim 18, wherein said
cellulose synthesis enzyme is cellulose synthase and
wherein the encoding sequence is in a sense orientation,
and wherein the synthesis of cellulose in said plant cell
is increased.

21. A method according to Claim 20 wherein said
plant cell additionally comprises a construct encoding a
sequence to an enzyme involved in the synthesis of
lignin or a lignin precursor.

22. A method according to Claim 20 wherein said
lignin encoding sequence is in an antisense orientation,
wherein transcribed mRNA from said sequence is
complementary to the equivalent mRNA transcribed from
the endogenous gene, whereby the synthesis of lignin is
suppressed.

1 I



CGAAA77AACC77CAC7AAAGGGAACA\AAGC7GGAGC: 

>Spei >=lccRI 

>Nocl >XbaI 



I 



I 



>BamHI >Sraal >?scl 
I ! II 

! I t I 



ICO 



27CTAGAAC7AGTGGA7CCCCCGGGC7GCAGGAA77CGGCAG 




200 



AGG AG AAAG AT AAG C AC7 77 77 7 7 G AGA &.7 GR.T 3 j AATCT GGGG 77 C CT G 
T 7 T GC C AC ACTT 3 7 GG7 G AAC A7GT TGGGT TG AAT G 7 T AA7G3 7G AACCT 

300 

T T T G7GG C 7 7 G C C A7 G AA7 G T AAT 7 7 C 7 C7 A7 77 GT AAG AG T T 3T T T TG A 

>Hinc3 >Ndel 
J ^ ^ 

400 

AT GAT G AAAAC C 7 G 7 T GG AC G AT G 7 C G AG AAGGC C ACC GGC GAT C AAT CG 

>ScoRl 
[ 

ACAAT3GCTGCACAT77GAACAAG7C7CAGGA7G77GGAATTCAT GCAAG 

5C0 

ACA7A7CAGCAG7G7G7C7ACA7TGGA7AG7GAAATGGC7GAAGACAA7G 

>tzoKl 
I 

G G AAT T C G A7 7 T GG AA3 AAC AGGG TGGAAAGTT GG AAAG AAAAGAAG AAC 

600 

AAGAAGAAGAAGCCTGCAACAACTAAGG7TGAAAGAGAGGC7GAAA7CCC 
ACC7GAGC^JVC.-- a -A7GGAAGATAAACCGGCACCGGATGCT7CCCAGCCCC 

700 

TCTCGACTATAATTCCAA7CCCGAAAAGCAGACTTGCACCATACCGAACC 

>3cll >5cU 
I I 
G7GA7CATTA7GCGAT7GA7CAT7CT7GC7C7777C77CCA77A7CGAG7
I 800

:TGACAG7GC7T77GGACTG7GGCTCACT7CAG7CA7AT 



G7GAAA7C7GG7TTGCA7T7TCC7GGG7GTTGGAICAG77CCC7AAGTGG 
>Hpal 

I 900 

! 

TATC~3TrAACAGGGAAACATACAITGACAGACTATCTGCAAGATA7GA 

>PstI 

I 

AAGAGAAGG7GAACCTGA7GAAC77GCTGCAG77GAC7TCTTCG7GAG7A 

I 

1 1000 

I 

CAG7C-GA7C:A7TGAAAGAGCC?CCA77GATTACTGCCAATAC7G7GC77 
TCCArrrrrZCCTTGGACTACCCGGTGGATAAGGTCrcrrGrTArATArc 

:::c 

TGArGArGGrGCGGCCATGCrGACArrrGAArCTCTAGTAGAAACA-SCCG 
AC^TTGGA^GA-^GTGGGTTCCArTCrGCAAAA^-ATTTTCCArrGAACCC 

I 

I 1200 

i 

C GGGCAC C T G AG7777 AC77C7 CACAGAAGA77GA7T AC77 GAAAGA7 AA 
AG7GCAGCCC7C777TG7AAAAGAACG7AGAGC7ATGAA«GAGA77A7G 

1300 

AAGAG 7 AC AAAA7T C GAA7 C AATGC 7 7 TAG 77GC AAAGG C 7 C AG AAAACA 

>MscI 

1 

GZrGA.rGAAGGArGGACAATGCAAGArGGAACrTCTTGGCCAGGAA^7>.a
I 1400

I 

CCCGC37GATCACCC7GGCATGA77CAGGT7TTCC7TGGATA7AG7GG7G 

I 

C 7 7 G 7 3 AC A 7 7 G AAGG AAA7 G AAC77 7 7 7 CG AC 7GG 77 7 AC G 7 C 7 77 AG A 

1500 

GAGAAGAGAC77GGCTACCAACACCACAAAAAGGCTGG7GC7GAAAA7GC
2C0C

GGAAA77GACAA77A7GA7GAG7A7GAA i .GA73AA7G77GA7C7G7CAAA 




rAA7GGAGAA7GGAGGAGTGGC7GAA7
;AACCCTTCCACACr.-^.T



CAAGGAAGCAA77CA7G7CATCAGC7G7GGC7ATGAAGAGAAGAC7GCA7 

>ScoR5 
t 

1 2200 
I 

GGGGGAAAGAGA77GGA7GGA7A7A7GG77CAG7CAC7GAGGA7A7377A

>Clai >SphI
AC7GGC77CAAAA7GCACTGCCGAGGA73GAGATCGA7TTAC7GCA7G33

2300
C77AAGGCCAGCA77CAAAGGA7C73CAC7CA7CAA7C7G7CTGA7CCG7
AGGCA77GCCC7C7A7GGTATGGC7T7GGAGG7GG7CG7C77AAA7GGC7
7CAAAGAC7AGCA7A7ATAAACACCA773TC7A7CCTT7CACA7CCCT7C
CAC7CA77GCC7AT7G7TCAC7ACCAGCAA7C7GTC77C7CACAGGAAAA
TT7A7CA7ACCAACGC7C7CAAACCTGGCAAG7G7TC7C777CT7GGCC7 

>Xhoi >SacI 

! I 

! I 2500 

i I 

7T7CC7T7CCAC7A7CG7GACTGCTG77CTCGAGC7CCGA7GGAG7GG7G 



7cagca7tgaggac77a7ggcg7aacgagcag77ttggg7ca7cgg7ggc 

2t:c 

gggc a7tg ac ac c aac 7 77 ac tg 7c ac7gc caaagc agc7 gatga7 gc ag 

>Sacl 
I 

I 2E0C 
i 

A7777GGrGAGC7C7ACA7TGTGAJ^.73GACTACLAC77C7AA7CCC7C::A 



rrcrccGA 



rCAACAAAGGGTACGAAGCTTGGGGACCAC: 



:GGG7CA7CCrCCA7C77TATCCATTCCTC 



w 

A.7GGGACGCCAAAACAGGACACCAACCA77G77G7CC777GG7CAG7G77 
GT7GGC77C7G7C77::7C7CTTG7T7GGG77CGGATCAACCCGT7TG7CA
GCACCGCCGA7AGCACCACCGTG7CACAGAGC7GCAT77CCATTGA77G7

TG A7 G A7 A7 7 A7 G 7 G 7 7 TC 7 T AG AA77 G AAATC AT TGC AAGT AAG7 G 3 AC 

3200 
* 

-CAAACATG7C7A7TGAC7AAG7777GAACAG7T7GTACCCAT777A77C 
TTAGCAG7GTG7AA7777CCTAAACAATGC7A7GAAC7ATACA7A777CA
I 3300
TTGATArrrACATTAAArGAAACTACATCAGTCTGCAGAAAAAAAAAAAA
1 I
>-?lAAAAAAACTCGAGGGGGGGCCCGGTA
I 20 || 40 60



AACTAGTGGA7CCCCCGGGCTGCAGGAATTCGGC^CGAGCGAGGAGATGGGTTCC3TTTrGTAAGA-^GCA 
TTGATCACCTAGGGGGCCCGACGTCCTTAAGCCGTGCTCGGTCCrCTACCCAAGGCAAAACATTCTTCGT 

>Narl >TthlllI 

I i 

8C | 100 120 | 140 

* j * ★ * * * | ★ 

TAATGTTGAGCCCAGGGCGCCGGAGTTTTATTTCAATGAGAAGATTGATTATTTGAAGGACAAGGTCCAT 
ATTACAAC?GGGGTCCCGCGGCCTCAAAATAAAGTTACrCTTCTAACTAATAAACTTCCTGTTCCAGGTA 

>DraI >NsiI
I I 
160 180 I 200 | 

x * * •* * I I *• 

CCTAGCTTTGTTAAAGAACGGAGAGCCATGAAAAGGGAATATGAAGAATTTAAAGTAAGGATCAATGCAT
GGATCGAAACAATTTCTTGCCTCTCGGTACTTTTCCCTTATACTTCTTAAATTTCATTCCTAGTTACGTA 

>NcoI

i 

220 240 260 | 280 

* * ★ * * j w * 

TAGTAGCAAAAGCTCAGAAGAAACCAGAAGAAGGATGGGTGATGCAAGATGGCACCCCATGGCCCGGAAA
ATCATCGTTTTCGAGTCTTCTTTGGTCTTCTTCCTACCCACTACGTTCTACCGTGGGGTACCGGGCCTTT

>BclI >ApaLI

I I 

I 300 320 I 340 

* * * * * I * * 

TAACACTCGTGATCATCCTGGAATGATTCAGGTCTATCTAGGAAGTGCCGGTGCACTCGATGTGGATGGC 
ATTGTGAGCACTAGTAGGACCTTACTAAGTCCAGATAGATCCTTCACGGGCACGTGAGGTACACCTACCG

360 380 1 400 420 

AAAGAGCTGCCTCGACTTGTCTATGTTTCTCGTGAGAAACGACCTGGTTATCAGCACCATAAGAAAGCCG
TTTCTCGACGGAGCTGAACAGATACAAAGAGCACTCTTTGCTGGACCAATAGTCGTGGTATTCTTTCGGC

>PstI
I 

440 | 460 480 

* * * j * * * # 

GTGCTGAGAATGCTCTGGTTCGAGTTTCTGCAGTGCTTACTAATGCACCCTTCATATTGAATCTGGATTG
CACGACTCTTACGAGACCAAGCTCAAAGACGTCACGAATGATTACGTGGGAAGTATAACTTAGACCTAAC

>BclI >BamHI

\ \ 

I 500 520 540 | 560 

| * * * * [ * V 

TGATCATTACATCAACAATAGCAAGGGCATGAGGGAAGGGATGTGCTTTTTAATGGATCCTCAGTTTGGA
ACTAGTAATGTAGTTGTTATCGTTCCGGTACTCCCTTCGCTACACGAAAAATTACCTAGGAGTCAAACCT

>Hind3 >BspHI >ClaI

FIGURE 7A
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I 580 600 ! 1620 

I - * * * I | 

AAGAAGCTTTGTTATGTTCAATTTCCACAGAGATTTGATGGTATTGATCGTCATGATCGATATGCTAATC
TTCTTCGAAACAATACAAGTTAAAGGTGTCTCTAAACTACCATAACTAGCAGTACTAGCTATACGATTAG 

>EcoR5



640 660 680 700



GAAATGTTGTCTTCTTTGATATCAACATGTTGGGATTAGATGGACTTCAAGGCCCTGTATATGTAGGCAC
CTTTACAACAGAAGAAACTATAGTTGTACAACCCTAATCTACCTGAAGTTCCGGGACATATACATCCGTG 



>AlwNI




AGGGTGTGTTTTCAACAGGCAGGCATTGTATGGCTACGATCCACCAGTCTCTGAGAAACGACCAAAGATG 
TCCCACACAAAAGTTGTCCGTCCGTAACATACCGATGCTAGGTGGTCAGAGACTCTTTGCTGGTTTCTAC 



780 800 820 840



ACATGTGATTGCTGGCCTTCTTGGTGTTGCTGTTGTTGCGGAGGTTCTAGGAAGAAATCAAAGAAGAAAG 
TGTACACTAACGACCGGAAGAACCACAACGACAACAACGCCTCCAAGATCCTTCTTTAGTTTCTTCTTTC 



860 880 900



GTGAAAAGAAGGGCTTACTCGGAGGTCTTTTATACGGAAAAAAGAAGAAGATGATGGGCAAAAACTATGT
CACTTTTCTTCCCGAATGAGCCTCCAGAAAATATGCCTTTTTTCTTCTTCTACTACCCGTTTTTGATACA 



920 940 960 980



GAAAAAAGGGTCTGCACCAGTCTTTGATCTCGAAGAAATCGAAGAAGGGCTTGAAGGATACGAAGAATTG 
CTTTTTTCCCAGACGTGGTCAGAAACTAGAGCTTCTTTAGCTTCTTCCCGAACTTCCTATGCTTCTTAAC 



>AseI >XmnI >XmnI
I I I
1000 1020 1040



GAGAAATCGACATTAATGTCGCAGAAGAATTTCGAGAAACGATTCGGACAATCACCGGTTTTCATTGCCT 
CTCTTTAGCTGTAATTACAGCGTCTTCTTAAAGCTCTTTGCTAAGCCTGTTAGTGGCCAAAAGTAACGGA 

>XmnI



1060 1080 1100 1120



CAACTTTGATGGAAAATGGTGGCCTTCCTGAAGGAACTAATTCCACATCACTGATTAAAGAGGCCATTCA 
GTTGAAACTACCTTTTACCACCGGAAGGACTTCCTTGATTAAGGTGTAGTGACTAATTTCTCCGGTAAGT 



1140 1160 1180



CGTAATTAGCTGTGGTTATGAAGAAAAAACTGAGTGGGGCAAAGAGATCGGATGGATTTATGGGTCGGTG 
GCATTAATCGACACCAATACTTCTTTTTTGACTCACCCCGTTTCTCTAGCCTACCTAAATACCCAGCCAC 



>NsiI
I
1200 1220 I 1240 1260



ACGGAAGATATATTAACAGGTTTCAAGATGCATTGTAGAGGGTGGAAATCGGTTTATTGTGTACCGAAAA 
TGCCTTCTATATAATTGTCCAAAGTTCTACGTAACATCTCCCACCTTTAGCCAAATAACACATGGCTTTT 



1280 1300 1320
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GACCGGCATTCAAAGGGTCCGCTCCAATCAATCTCTCGGATCGGTTGCACCAAGTTTTGAGATGGGCACT 
CTGGCCGTAAGTTTCCCAGGCGAGGTTAGTTAGAGAGCCTAGCCAACGTGGTTCAAAACTCTACCCGTGA

1340 1360 1380 1400 

******* 

TGGTTCTGTAGAAATTTTCCTTAGTCGTCACTGTCCACTTTGGTATGGTTATGGTGGAAAACTGAAATGG
ACCAAGACATCTTTAAAAGGAATCAGCAGTGACAGGTGAAACCATACCAATACCACCTTTTGACTTTACC 

>AvaI

I

>PaeR7I
I

>XhoI

I 

I 1420 1440 1460

I ****** * 

CTCGAGAGGCTTGCTTATATCAACACCATTGTTTACCCTTTCACCTCGATCCCTTTACTCGCCTATTGTA 
GAGCTCTCCGAACGAATATAGTTGTGGTAACAAATGGGAAAGTGGAGCTAGGGAAATGAGCGGATAACAT 

>Pvu2 
I 

1480 1500 1520 1540 

j * * * * * * x 

CTATTCCAGCTGTTTGTCTTCTCACCGGCAAATTCATCATTCCAACTCTAAGCAACCTTACAAGTGTGTG 
GATAAGGTCGACAAACAGAAGAGTGGCCGTTTAAGTAGTAAGGTTGAGATTCGTTGGAATGTTCACACAC 

1560 1580 1600 

******* 

GTTCTTGGCACTTTTCCTCTCCATCATTGCAACTGGAGTGCTTGAACTTCGATGGAGCGGGGTTAGCATC 
CAAGAACCGTGAAAAGGAGAGGTAGTAACGTTGACCTCACGAACTTGAAGCTACCTCGCCCCAATCGTAG 

1620 1640 1660 1680 

* * * * * * * 

CAAGACTGGTGGCGCAATGAACAATTCTGGGTGATCGGAGGTGTCTCCGCCCATCTTTTTGCTGTCTTCC
GTTCTGACCACCGCGTTACTTGTTAAGACCCACTAGCCTCCACAGAGGCGGGTAGAAAAACGACAGAAGG 

1700 1720 1740 

* * * . * * * * 

AGGGCCTCCTCAAAGTCCTAGCTGGAGTAGACACCAACTTCACCGTAACAGCAAAAGCAGCAGACGATAC 
TCCCGGAGGAGTTTCAGGATCGACCTCATCTGTGGTTGAAGTGGCATTGTCGTTTTCGTCGTCTGCTATG

>EcoRI

I 

I 1760 1780 1800 1820

| ****** * 

AGAATTCGGTGAACTTTATCTCTTCAAATGGACAACTCTCTTAATCCCTCCCACAACTCTGATAATACTG 
TCTTAAGCCACTTGAAATAGAGAAGTTTACCTGTTGAGAGAATTAGGGAGGGTGTTGAGACTATTATGAC 

1840 1860 1880 

* * * * * * * 

AACATGGTCGGAGTCGTGGCCGGAGTTTCAGACGCAATCAACAACGGCTATGGTTCATGGGGTCCATTGT 
TTGTACCAGCCTCAGCACCGGCCTCAAAGTCTGCGTTAGTTGTTGCCGATACCAAGTACCCCAGGTAACA 

1900 1920 1940 1960

* * * * * * * 

TCGGCAAACTGTTCTTCGCATTCTGGGTCATTCTTCATCTTTACCCATTCCTCAAAGGTTTGATGGGGAG
AGCCGTTTGACAAGAAGCGTAAGACCCAGTAAGAAGTAGAAATGGGTAAGGAGTTTCCAAACTACCCCTC

>ClaI

I 

1980 2000 | 2020 

* * * * * * * 

FIGURE 7C 
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ACAAAACAGGACGCCCACCArTGrrSTGCrTTGuTCCATACTTTrGGCATCGATTrrCTCACTGGTTTGG 
TGTTTTGTCCTGCGGGTGGTAACAACACGAAACCAGGTATGAAAACCGTAGCTAAAAGAGTGACCAAACC

>ClaI
I 

2040 2060 2080 2100 

j * * * * * * «• 

GTACGGATCGATCCCTTCTTGCCCAAACAAACAGGTCCAGTTCTTAAACAATGTGGCGTGGAGTGCTAAA
CATGCCTAGCTAGGGAAGAACGGGTTTGTTTGTCCAGGTCAAGAATTTGTTACACCGCACCTCACGATTT 

2120 2140 2160

* * ic * * * -r 

VTTTTATTTTCCCTTTTTGCCACTACTGTTGATTTGCTGTGATTC 




2180 2200 2220 2240



TAAAAGGGATTTATCTTGTTTGTAAAAAGTCTCCTATGATTTTGTTGGTTCAATTTAATTTCTATATGGT 
ATTTTCCCTAAATAGAACAAACATTTTTCAGAGGATACTAAAACAACCAAGTTAAATTAAAGATATACCA



>SspI >DraI
I I
>PaeR7I >AvaI >XhoI
I
>Asp718
I
>ApaI >KpnI
I I
2260 2280 2300



AAAAAAATATTTCTTTAAATTAACrATAAAAAAAAAAAAAAAAAACTCGAGGGGGGGCCCGGTACC 
TTTTTTTATAAAGAAATTTAATTGATATTTTTTTTTTTTT-TTTTGAGCTCCCCCCCGGGCCATGG 
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10 2C 30 40 50 6C 

*»**-*■** ***** 

GGGTGATT3ACTAA.^TTTTTAAAA.\TTrTGAAGGTTTTA.^TGAGJ^TTTTT.\AACAATr 
70 80 90 100 110 120 

* * * * * * * * * * TC * 

TTGTATGTTAAACTAAAACTTTCAAAAAAAATTTTGAAAGGTTTAATGAGAATTTTAAAA 

130 140 150 160 170 180 

************* 

ATTTTGAGCGGGCTAATTAAAATTTTTAAAAAATGTATAATAAAAAAATTCAAAAACTCT 

>ApaI
*l 

190 200 | 210 220 230 240 

* * ^-k * ' * * * * * * * 

TTGAGGCCATAAAGGTCATCGGGCCCTTAAATACATCAGCTTGTTGTTTCCTCATATTAC

>HpaI
I 

250 I 260 270 280 290 300 

* * *?* */ * * * *■ * * * 

TCATGTTATTTCAGTTAACAGATATAATGGCTATCATTTGATTTAGGAGTGAJLATCTAAA

>PacI 
1 

310 320 330 340 350 360 

* * * w * *' * * * | * * n 

AATTCGAAAAGTATAAAAACTAAAAAGGATTAAATT^ 

>HpaI
I 

370 380 390 400 410 420

******** * * * * 

TTTACTArrCCAATAACAGAATTTTGAGTTAACAAATTTAACTGCTACAA.rrTGGTTCGA 

>BclI
I 

430 440 450 4601 470 480 

* * * * * * ye * | * * * * 

GACCAAAATTACAAJ^CCGGAAAAGTATTGGGACTAAAATTGATCAAATTAGAGTACATG 

490 500 510 520 530 540 

************ 

GGTTAAATTCACAA.CTTACTTATGGTACAAGGATTAATAGCATAATTTCTCCTTAGGCAA 

>Hind3 
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550 560 570 1580 590 600 

* * * * * * * | * * * ★ * 

ATGCCAGTTAGTTAAAGATGTACCTTGCCCAACCGAAAGCTTCCTTAAACTTCCCGCAAT 

>Hind3 
I 

610 620 630 640 I 650 660

* * * * * * * * | * w * » 

TTTTT AAATTTCTTTTTCCCTTAGA^J^AAAGAACAAAAATGT AAGCTTTGCTTGTCAGAG 

670 680 690 700 710 720

*■»***»»**»** 

FIGURE 8A 
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